AMYCNE: Confident copy number assessment using whole genome sequencing data
نویسندگان
چکیده
Copy number variations (CNVs) within the human genome have been linked to a diversity of inherited diseases and phenotypic traits. The currently used methodology to measure copy numbers has limited resolution and/or precision, especially for regions with more than 4 copies. Whole genome sequencing (WGS) offers an alternative data source to allow for the detection and characterization of the copy number across different genomic regions in a single experiment. A plethora of tools have been developed to utilize WGS data for CNV detection. None of these tools are designed specifically to accurately estimate copy numbers of complex regions in a small cohort or clinical setting. Herein, we present AMYCNE (automatic modeling functionality for copy number estimation), a CNV analysis tool using WGS data. AMYCNE is multifunctional and performs copy number estimation of complex regions, annotation of VCF files, and CNV detection on individual samples. The performance of AMYCNE was evaluated using AMY1A ddPCR measurements from 86 unrelated individuals. In addition, we validated the accuracy of AMYCNE copy number predictions on two additional genes (FCGR3A and FCGR3B) using datasets available through the 1000 genomes consortium. Finally, we simulated levels of mosaic loss and gain of chromosome X and used this dataset for benchmarking AMYCNE. The results show a high concordance between AMYCNE and ddPCR, validating the use of AMYCNE to measure tandem AMY1 repeats with high accuracy. This opens up new possibilities for the use of WGS for accurate copy number determination of other complex regions in the genome in small cohorts or single individuals.
منابع مشابه
I-37: Establishing High Resolution Genomic Profiles of Single Cells Using Microarray and Next-Generation Sequencing Technologies
The nature and pace of genome mutation is largely unknown. Standard methods to investigate DNA-mutation rely on arraying or sequencing DNA from a population of cells, hence the genetic composition of individual cells is lost and de novo mutation in cell(s) is concealed within the bulk signal. We developed methods based on (SNP-) arraying and next-generation sequencing of single-cell whole-genom...
متن کاملO-38: Concurrent Whole-Genome Haplotyping and Copy-Number Profiling of Single Cells
Background Methods for haplotyping and DNA copynumber typing of single cells are paramount for studying genomic heterogeneity and enabling genetic diagnosis. Before analyzing the DNA of a single cell by microarray or next-generation sequencing, a whole-genome amplification (WGA) process is required, but it substantially distorts the frequency and composition of the cell’s alleles. As a conseque...
متن کاملI-44: Concurrent Whole-Genome Haplotyping and Copy-Number Profiling of Single Cells
Background Methods for haplotyping and DNA copynumber typing of single cells are paramount for studying genomic heterogeneity and enabling genetic diagnosis. Before analyzing the DNA of a single cell by microarray or next-generation sequencing, a whole-genome amplification (WGA) process is required, but it substantially distorts the frequency and composition of the cell’s alleles. As a conseque...
متن کاملCLImAT: accurate detection of copy number alteration and loss of heterozygosity in impure and aneuploid tumor samples using whole-genome sequencing data
MOTIVATION Whole-genome sequencing of tumor samples has been demonstrated as an efficient approach for comprehensive analysis of genomic aberrations in cancer genome. Critical issues such as tumor impurity and aneuploidy, GC-content and mappability bias have been reported to complicate identification of copy number alteration and loss of heterozygosity in complex tumor samples. Therefore, effic...
متن کاملA survey of tools for variant analysis of next-generation genome sequencing data
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straight...
متن کامل